Conical Dimension as an Intrinsic Dimension Estimator and its Applications

Authors

  • Xin Yang
  • Sebastien Michea
  • Hongyuan Zha
Abstract

Estimating the intrinsic dimension of a high-dimensional data set is a very challenging problem in manifold learning and several other application areas in data mining. In this paper we introduce a novel local intrinsic dimension estimator, conical dimension, for estimating the intrinsic dimension of a data set consisting of points lying in the proximity of a manifold. Under minimal sampling assumptions, we show that the conical dimension of sample points in a manifold is equal to the dimension of the manifold. The conical dimension enjoys several desirable properties such as linear conformal invariance and it can also handle manifolds with self-intersections as well as detect the boundary of manifolds. We develop algorithms for computing the conical dimension paying special attention to the numerical robustness issues. We apply the proposed algorithms to both synthetic and real-world data illustrating their robustness on noisy data sets with large curvatures.


Similar resources

Conical dimension as an intrinsic dimension estimator and its applications

Estimating the intrinsic dimension of a high-dimensional data set is a very challenging problem in manifold learning and several other application areas in data mining. In this paper we introduce a novel local intrinsic dimension estimator, conical dimension, for estimating the intrinsic dimension of a data set consisting of points lying in the proximity of a manifold. Under minimal sampling as...


Regularized Maximum Likelihood for Intrinsic Dimension Estimation

We propose a new method for estimating the intrinsic dimension of a dataset by applying the principle of regularized maximum likelihood to the distances between close neighbors. We propose a regularization scheme which is motivated by divergence minimization principles. We derive the estimator by a Poisson process approximation, argue about its convergence properties and apply it to a number of...


Maximum Likelihood Estimation of Intrinsic Dimension

We propose a new method for estimating intrinsic dimension of a dataset derived by applying the principle of maximum likelihood to the distances between close neighbors. We derive the estimator by a Poisson process approximation, assess its bias and variance theoretically and by simulations, and apply it to a number of simulated and real datasets. We also show it has the best overall performanc...


Dimension Estimation Using Random Connection Models

Information about intrinsic dimension is crucial to perform dimensionality reduction, compress information, design efficient algorithms, and do statistical adaptation. In this paper we propose an estimator for the intrinsic dimension of a data set. The estimator is based on binary neighbourhood information about the observations in the form of two adjacency matrices, and does not require any ex...


Fractal Dimension Pattern Based Multiresolution Analysis for Rough Estimator of Person-Dependent Audio Emotion Recognition

As a general means of expression, audio analysis and recognition has attracted much attention for its wide applications in the real world. Audio emotion recognition (AER) attempts to understand the emotional states of humans from given utterance signals, and has been studied widely for its further development of friendly human-machine interfaces. Distinguished from other existing works, the pers...




Publication date: 2007